Abstract


The computation of texture and of stereoscopic depth is limited by 
the eyes and by the subsequent stages of the visual system in 
humans, and by the quality of the optical 'front end' as well as by 
the computational hard- and software in machines. The quality of 
the optics and the resolution of the opto-electronic transducer (e.g. 
the retina) limit spatial resolution, and, consequently, the 
discrimination of textures. In stereoscopic depth, thresholds far 
below the grain of the input-device (in humans: the photoreceptor 
diameter) can be attained. This extreme accuracy in locating a 
stimulus, called hyperacuity, is due to interpolation between the 
positions of the input elements, such as the photoreceptors in 
humans. Interpolation is most likely a feat achieved by the visual 
cortex, depending on a good signal-to-noise ratio of the stimulus 
representation. Again, resolution and contrast modulation are 
critical factors. The algorithms used by the human brain to 
discriminate between textures and to compute stereoscopic depth 
are very fast and efficient. Their study might be beneficial for the 
development of better algorithms in machine vision. 
Resolution limits of the optics and "opto-electronic interface" in
humans. 

All visual information available to the brain is acquired through the eyes. It is 
therefore evident that properties of the optical media and of the retina impose 
limits upon visual perception — even in tasks that are thought to be primarily 
mediated by the visual cortex, such as vernier acuity and stereoscopic depth 
perception. We will review factors limiting the computation of texture and 
stereoscopic depth in humans. Firstly, the role of the optics and of the 
photoreceptors of the retina will be discussed. Next, we give a brief outline of 
the still controversial issue of texture perception and of the underlying 
computational mechanisms, followed by a description of the basic concepts of 
stereoscopic vision and hyperacuity. A number of factors that limit both the 
perception of texture and of stereoscopic depth are reviewed: the quality of the 
retinal image, blur, luminance, contrast, temporal factors, motion, stimulus area 
and retinal position. 

The optical media of the human eye limit the resolution of the retinal image to
around 100 to 120 points (50-60 cycles) per degree of visual angle (Westheimer, 1960;
Rohler, 1962; Campbell & Green, 1965; Campbell & Gubisch, 1966; Campbell
& Robson, 1968). Hence, two points of the visual world are fused on the retina if
their spatial separation is below 0.5 arcmin. This limit for two-point resolution is
achieved only under the most favorable conditions, such as perfect optical media,
an optimal pupil size, and ray paths near the axis of the optical system (cf., e.g.,
Green, 1967; Campbell, 1974). Incidentally, the optics of the human eye
approaches a perfect optical system for pupil sizes up to 3-4 mm, with the 
highest resolution between 2.5 and 4 mm (Rohler, 1962; Campbell & Gubisch, 
1966). 

The resolution attainable by the human retina closely matches the maximal
resolution of the optics. In the foveola, where the photoreceptors are smallest
and most densely packed, their spacing is around 2-3 µm (Curcio et al., 1987),
again allowing a maximal resolution of around 100-120 pixels per degree,
corresponding to around 50 to 60 cycles per degree of a periodic pattern such
as a sinusoidal grating. Most observers do not achieve such high resolutions,
which is why a resolution limit of 1 arcmin (20/20 or 1.0) is conventionally
accepted as full vision by ophthalmologists.
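
The correspondence between receptor spacing and angular resolution can be checked with a short calculation. The sketch below is illustrative only: the posterior nodal distance of about 16.7 mm is an assumed schematic-eye value that does not appear in the text, and the function names are hypothetical.

```python
import math

# Assumption (not from the text): posterior nodal distance of a schematic
# human eye, about 16.7 mm, used to convert retinal distance to visual angle.
NODAL_DISTANCE_MM = 16.7


def samples_per_degree(spacing_um, nodal_distance_mm=NODAL_DISTANCE_MM):
    """Number of receptors that fit into one degree of visual angle."""
    um_per_degree = nodal_distance_mm * 1000.0 * math.tan(math.radians(1.0))
    return um_per_degree / spacing_um


def nyquist_cycles_per_degree(spacing_um, nodal_distance_mm=NODAL_DISTANCE_MM):
    """Finest grating that still receives two samples per cycle."""
    return samples_per_degree(spacing_um, nodal_distance_mm) / 2.0


for spacing_um in (2.5, 3.0):
    print(f"spacing {spacing_um} um: "
          f"{samples_per_degree(spacing_um):.0f} samples/deg, "
          f"{nyquist_cycles_per_degree(spacing_um):.0f} cycles/deg")
```

Under this assumption, spacings of 2.5 to 3 µm give roughly 100 to 120 samples per degree, i.e. about 50 to 60 cycles per degree, in line with the figures quoted above.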

Given these limitations of the attainable spatial resolution imposed by the optics 
and by the retina, visual thresholds of around 3 arcsec or below are 
astonishing. Such thresholds are regularly obtained in a number of the
so-called hyperacuity tasks such as vernier acuity and stereoscopic depth
perception (Wulfing, 1892; Andrews, Butcher & Buckley, 1973; Westheimer,
1976; cf. already Vernier, 1631). Positional information far below the
photoreceptor spacing can be correctly evaluated in the visual system, while the
detection of other image attributes is limited to a precision corresponding to the 
spacing of photoreceptors, and finer patterns appear as homogeneous gray. 
Since this paper is concerned with limits of visual perception, we shall consider 
what the limits of texture discrimination might be and then proceed to
stereoscopic vision. 


The computation of texture 

A texture is said to exist when part of a visual scene contains regular detail 
which is finer than the size of the surface which contains the detail. For 
example, a wooden board has a fine-grained, reasonably regular surface 
texture which is a property of the surface of the material rather than whatever 
shape is described by the outer edge of the board. Thus, texture tells us about 
surfaces, rather than shapes. 

Under most normal viewing conditions, the edge of a textured object will give 
rise to a strong luminance and/or color discontinuity. Thus, a visual system 
which ignores texture altogether may successfully detect the object contour. 
However, there are situations in which an object is camouflaged, having the 
same mean luminance and color as its background. But even if the color and 
luminance are well matched, it may be that the texture of the target is different 
from that of its immediate background. Both motion and stereoscopic depth can 
be computed from arrays of points which are correlated either between the eyes 
or between different points in time — without the requirement of prior analysis of 
complex shape. Texture is an important cue also to object recognition through 
scene segmentation by means of motion or depth gradients. Since camouflage 
is much favored by evolution, it is not surprising that visual systems have 
evolved which include texture discrimination (and detection of stereo depth from 
texture elements, as in random-dot stereograms) as part of their array of 
segmentation modules. 

We will address the basic question of how the visual system encodes texture 
and maps discontinuities in it. 

Several models of texture discrimination have been suggested. Both Marr 
(1976) and Julesz and Bergen (1983) proposed that the visual system 
evaluates the density of "textons" in a region of visual space. A texton is a 
feature extracted from the image, such as a line segment (at a given 
orientation), an elongated blob, a termination, or an intersection. The texton 
theory holds that the only important measure is texton density, and not the exact 
spatial organization of the group of textons. This "local phase insensitivity" was 
investigated for grey-level stimuli by Rentschler, Hubner, and Caelli (1988) and 
found to hold for these, as well as for the more traditional line-drawn stimuli. 
Thus, classical texture vision does not evaluate the spatial relationship between
elements; it responds only to their number.

Marr (1976, 1982) considered an overall scheme of a primal sketch based on
edge tokens, computing textons and performing simple statistics on them. This 
approach was extended by Voorhees and Poggio (1988) who proposed how to 
compute statistics on the textons. 




Fig. 1 The top row shows three texture-discrimination stimuli, (a), (b) and (c),
recreated from Bergen and Adelson (1988). Image (b) is the easiest to
discriminate, image (c) the hardest, and image (a) is intermediate. The middle
row shows the rectified Gabor-filter outputs from the computational model of
Griffiths et al. (1988). It is apparent that image (b) gives the strongest
figure-ground separation. The bottom row shows the texture-segmentation
loci extracted from the three images. Image (b) gives the
cleanest segmentation line, whereas in image (c) there are many
spurious distractors.


A rival class of theories proposed for texture discrimination is based on
Fourier analysis of the scene. While this has the advantage of being more
tractable computationally than the feature-extraction models (since it is often
difficult to extract classic textons from a cluttered natural scene), the evidence
seemed to point against this being the approach adopted by the
human visual system. Mayhew and Frisby (1978) as well as Julesz and Caelli
(1979) argue that a Fourier model does not account for observed
performance. Griffiths, Troscianko and Knapman (1988), however, found that a
Gabor-filter model (i.e., local Fourier analysis) followed by rectification and
gradient-based segmentation provides a good fit to the data of Mayhew and Frisby
(1978). In addition, there has been recent progress in developing
computational models that achieve texture segmentation by computing
parameters of elongated blobs (contrast, elongation, orientation) and performing
statistics on them to find texture boundaries (Voorhees, 1987; Voorhees and Poggio,
1988), and models using Gabor filters (Daugman, 1987; Griffiths et al., 1988;
Lively and Walters, 1988). Others have used a size-tuning approach (Bergen
and Adelson, 1988). As the computational models are developed further, the
differences between them seem to be eroding. For example, the elongated-blob
model is not very different from one that uses elongated blobs with
sidelobes (Gabor filters). Both the model of Voorhees and Poggio (1988) and
that of Griffiths et al. (1988; see Fig. 1) can account for the rank order of
discriminability of textures shown in the paper by Bergen and Adelson (1988).
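
The general idea of a Gabor-filter stage followed by rectification and gradient-based segmentation can be illustrated in one dimension. The following is a minimal sketch under strong simplifying assumptions, not the implementation of Griffiths et al. (1988) or of any other cited model: the two "textures" are sinusoids of equal mean and amplitude, and all frequencies, the filter width and the function names are hypothetical choices made for illustration.

```python
import math


def gabor_energy(signal, freq, sigma):
    """Local energy from a quadrature pair of 1-D Gabor filters."""
    half = int(3 * sigma)
    taps = [(k, math.exp(-k * k / (2.0 * sigma ** 2)))
            for k in range(-half, half + 1)]
    energy = {}
    for x in range(half, len(signal) - half):
        even = sum(g * math.cos(2 * math.pi * freq * k) * signal[x + k]
                   for k, g in taps)
        odd = sum(g * math.sin(2 * math.pi * freq * k) * signal[x + k]
                  for k, g in taps)
        # Rectification: keep the phase-insensitive magnitude only.
        energy[x] = math.hypot(even, odd)
    return energy


def texture_boundary(energy, step):
    """Position where the filter energy changes most steeply (coarse gradient)."""
    xs = [x for x in energy if x - step in energy and x + step in energy]
    return max(xs, key=lambda x: abs(energy[x + step] - energy[x - step]))


# Two "textures" with identical mean and amplitude but different grain.
N, EDGE = 240, 120
F_LEFT, F_RIGHT = 0.25, 0.10  # cycles per sample (assumed values)
signal = [math.sin(2 * math.pi * (F_LEFT if x < EDGE else F_RIGHT) * x)
          for x in range(N)]

energy = gabor_energy(signal, F_LEFT, sigma=8.0)
boundary = texture_boundary(energy, step=8)
print("true edge:", EDGE, "estimated edge:", boundary)
```

Because the two halves have the same mean luminance and contrast, a purely luminance-based edge detector would find nothing here; the boundary only becomes visible in the rectified output of the frequency-tuned channel.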

Caelli (1988) also argues that there is similarity between his adaptive model 
and the original dipole-statistics model of Julesz. So, in spite of what seem to 
be very different approaches, the differences in implementation are small. On 
pragmatic grounds, Julesz and Krose (1988) argue that simple filter theories 
should be looked at before going to complex filters. 


The computation of stereoscopic depth and hyperacuity 

While the computation of texture is limited to textures with a grain not finer than 
the grain of the input device, perceptual thresholds in the so-called hyperacuity 
tasks like vernier acuity and stereoscopic depth perception can be an order of 
magnitude lower, i.e., 3 arcsec or below. One possible explanation for such low 
thresholds was proposed by Hering (1899). He assumed positional averaging 
would take place along the edges of the stimuli. But experiments with dot stimuli 
instead of lines (Ludvigh, 1953) proved that spatial averaging along lines is not 
a necessary prerequisite. Still, the low thresholds can be explained both 
intuitively and formally. 

If the modulation transfer function of the eye’s optics were much better than it 
actually is — having a higher aperture and transmitting higher spatial 
frequencies — a point in the visual world could be imaged upon the retina as a 
point with a diameter clearly below the photoreceptor diameter. In that case, it 
would be impossible to determine the position of the point with an accuracy 
below the diameter of a photoreceptor. Projections to all parts of a 
photoreceptor would stimulate the receptor equally well. (A possible way to 
achieve transphotoreceptor accuracy in this case would be to move the point 
relative to the retina in a defined way.) Fortunately, the eye’s optical system 
does not achieve such a high resolution, but smears even the image of an 
infinitely small point over several photoreceptors according to a
Gaussian-shaped point spread function. The point spread function, i.e., the luminance
distribution produced on the retina by a point in the outer world, has a half width 
of around 0.5 arcmin for near-axis imagery. 
A point projected upon the exact middle of a photoreceptor stimulates all the
neighboring receptors to exactly the same degree (Fig. 2a). A lateral
displacement, even by a fraction of the photoreceptor diameter, will stimulate
the neighboring photoreceptor on this side more strongly than the one on the
opposite side (Fig. 2b). The position of the intensity maximum (i.e., the position
of the point) can thus be calculated with a precision far below the photoreceptor
diameter by comparing the relative excitations from a number of neighboring
photoreceptors. The precision of this spatial localization is limited only by the
signal-to-noise ratio in the system, since fluctuations in the receptor excitation
will limit the precision and reliability of the calculation of position.
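
A small simulation may make this argument concrete. It is a sketch only: the blur spot is modeled as a Gaussian whose standard deviation equals one receptor spacing (an assumed value), position is read out as the centroid of the receptor responses (one simple choice of comparison, not necessarily the one used by the visual system), and all names are hypothetical.

```python
import math
import random


def receptor_responses(point_pos, n_receptors=9, sigma=1.0,
                       noise_sd=0.0, rng=None):
    """Sample a Gaussian blur spot at integer receptor positions.

    Positions and sigma are expressed in units of one receptor spacing.
    """
    rng = rng or random.Random(0)
    half = n_receptors // 2
    return [(c, math.exp(-((c - point_pos) ** 2) / (2.0 * sigma ** 2))
             + rng.gauss(0.0, noise_sd))
            for c in range(-half, half + 1)]


def centroid(responses):
    """Estimate position from the relative excitation of the receptors."""
    total = sum(r for _, r in responses)
    return sum(c * r for c, r in responses) / total


# A shift of one tenth of a receptor spacing is recovered without noise ...
print("noise-free estimate:", centroid(receptor_responses(0.1)))

# ... and the estimate scatters more as receptor noise grows.
for noise_sd in (0.01, 0.05):
    rng = random.Random(1)
    estimates = [centroid(receptor_responses(0.1, noise_sd=noise_sd, rng=rng))
                 for _ in range(200)]
    mean = sum(estimates) / len(estimates)
    spread = math.sqrt(sum((e - mean) ** 2 for e in estimates) / len(estimates))
    print(f"noise sd {noise_sd}: spread of estimates {spread:.3f}")
```

The noise-free case shows that the receptor mosaic does not by itself prevent sub-receptor localization; the noisy cases show the signal-to-noise ratio taking over as the limiting factor.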




Fig. 2 The optics of the eye transforms even the smallest point into a
luminance distribution on the retina with a half-width of at least 20 
arcsec, as indicated in the upper part. Thus, every retinal image will 
extend over several photoreceptors (receptors are schematized in the 
lower part). In (a), a bright point is located exactly on the middle of the 
photoreceptor and all surrounding photoreceptors are equally stimulated. 
If the point is moved even by a fraction of a receptor diameter, the 
neighbor on this side will be more strongly stimulated than the one on the 
opposite side (b). The ensemble of neighboring receptors can provide 
positional information beyond the receptor diameter. 


These considerations can be proven formally. Shannon (1948) showed in the
so-called sampling theorem that any bandlimited function can be reconstructed
completely if it is sampled at a sufficiently high frequency. The sampling
frequency must be more than twice the highest frequency present in the signal. As
Barlow (1979) and Crick, Marr and Poggio (1981) have pointed out, the 
conditions of the sampling theorem are met in the human eye. The signal (the 
luminance distribution coming from objects of the outer world) is bandlimited by
the optics of the eye to spatial frequencies below approximately 50 to 60 cycles 
per degree (as mentioned above), and moreover, the modulation of the 
frequencies at the upper end of the transmitted range is rather small. The foveal 
photoreceptor density, corresponding to slightly more than 120 receptors per 
degree, samples this luminance distribution slightly more than twice for the 
highest spatial frequencies. Hence, appropriate filtering can reconstruct the 
original luminance distribution from the information of the single photoreceptors, 
in principle, with unlimited accuracy. In practice, of course, the precision in the 
filtering or interpolation process is limited by noise at different stages of the 
system and by several characteristics of the filters (cf., e.g., Wilson, 1986). 
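
The reconstruction step can be sketched with Whittaker-Shannon (sinc) interpolation, the textbook form of the "appropriate filtering" mentioned above. This is an idealized, noise-free illustration, not a model of cortical processing: the blur spot is a Gaussian with a standard deviation of 1.5 sample spacings (an assumed value that makes it effectively bandlimited below the Nyquist frequency), and the names are hypothetical.

```python
import math


def sinc(x):
    """Normalized sinc, the ideal low-pass interpolation kernel."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)


def reconstruct(samples, start, x):
    """Whittaker-Shannon interpolation of unit-spaced samples at position x.

    samples[i] is the value measured at position start + i.
    """
    return sum(s * sinc(x - (start + i)) for i, s in enumerate(samples))


def blurred_point(x, center=0.3, sigma=1.5):
    """Gaussian blur spot centered between two sampling positions."""
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))


# Sample the spot at integer positions only (the "receptors").
START, COUNT = -20, 41
samples = [blurred_point(START + i) for i in range(COUNT)]

# Search the reconstructed profile on a grid 100 times finer than the sampling.
fine_grid = [i / 100.0 for i in range(-100, 101)]
peak = max(fine_grid, key=lambda x: reconstruct(samples, START, x))
print(f"true center 0.30, peak of reconstruction {peak:.2f}")
```

Although no sample is taken at the true center, the interpolated profile peaks there to within the resolution of the search grid, which is the sense in which accuracy is limited by noise rather than by the sampling grain.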

As Julesz (1971) has shown, stereoscopic depth is also experienced with visual 
noise, given binocular disparities between the otherwise identical images to 
both eyes. Circumscribed regions of noise sharing a common disparity are
perceived at the same depth plane and their outline forms a shape. Stereoscopic
vision thus allows us to perceive shapes defined by common depth and is
another means of breaking camouflage. Marr and Poggio (1976, 1979) and Marr,
Palm, and Poggio (1978) solved the underlying computational problems.


Optical image quality, blur, luminance, and contrast. 

The cortical representation of the visual world in humans has a two-point 
resolution that is directly limited by the properties of the eye’s optical apparatus 
and by the density of retinal photoreceptors. It is also limited by the 
convergence of photoreceptors upon ganglion cells, whose axons transmit the 
information about the visual world to the geniculate body from where it is 
relayed to the visual cortex. On the other hand, positional information can be 
obtained that is far more accurate than two-point resolution. The attainment of
such high positional accuracy or hyperacuity is made possible by the failure of 
the eye's optics to transmit spatial frequencies that are high enough to cause 
false resolutions ("aliasing") in the interpolation process. 

Optimal computation of stereoscopic depth requires a sharp image, i.e., high 
spatial frequencies. The exact localization of the intensity maximum could in
principle be calculated also from a very blurred image that contains only low
spatial frequencies. But localization is much more difficult to achieve with
shallow luminance gradients, as on the right of Fig. 3, than with steeper ones, as on
the left of that figure, since with a shallow gradient the same amount of intensity
noise corresponds to a much higher positional uncertainty or error than with a
steep one. Therefore, it is to be expected that thresholds for stereoscopic depth
perception depend on the luminance and especially on the contrast and sharpness
or spatial-frequency content of the stimuli. Hyperacuity localization indeed
deteriorates with decreasing contrast in vernier acuity (Foley-Fisher, 1977;
Bradley & Skottun, 1987; Krauskopf, pers. comm.; Westheimer, pers. comm.),
the detection of spatial discontinuities (Morgan, 1986), as well as in
stereoscopic depth perception (Lit, Finn & Vicars, 1972; Halpern & Blake, 1988; cf.
also Frisby & Mayhew, 1978, for the stereoscopic contrast sensitivity function).

Fig. 3 Noise in the system with identical amplitude (e.g. receptor noise; vertical
arrows) leads to a larger positional error (horizontal arrows) in the
determination of the horizontal position of shallow luminance gradients (right
side) than of steeper ones (left side).
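
To first order, the relation illustrated in Fig. 3 is that the positional error equals the intensity noise divided by the slope of the luminance profile. The sketch below applies this to an edge idealized as a linear ramp; the ramp model, the noise level and the function names are assumptions for illustration and are not taken from the cited studies.

```python
def ramp_slope(contrast, width_arcmin):
    """Slope of an edge idealized as a linear ramp (intensity per arcmin)."""
    return contrast / width_arcmin


def positional_error_arcsec(intensity_noise, contrast, width_arcmin):
    """First-order position uncertainty: delta_x = delta_I / |dI/dx|."""
    error_arcmin = intensity_noise / abs(ramp_slope(contrast, width_arcmin))
    return 60.0 * error_arcmin


NOISE = 0.02  # assumed noise amplitude, in units of full contrast

for contrast, width in ((1.0, 1.0), (1.0, 5.0), (0.2, 1.0)):
    err = positional_error_arcsec(NOISE, contrast, width)
    print(f"contrast {contrast}, edge width {width} arcmin: "
          f"error about {err:.1f} arcsec")
```

In this idealization, blurring the edge fivefold and reducing its contrast fivefold have the same effect on the positional error, which is why both sharpness and contrast are expected to matter.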

Spatial resolution decreases, of course, with blurring caused by imperfect 
focussing, e.g., errors in refraction of the eye. As a rule of thumb, acuity (and
with it the resolution of textures) decreases by a factor of 3-4 per diopter of
refractive error (Diepes, 1975; cf. also Green & Campbell, 1965). Spatial resolution also
depends critically upon the luminance and contrast of the test targets. Fig. 4, 
taken from Aulhorn (1964), illustrates the relation between resolution and 
luminance (cf. Ludvigh, 1941, for the effect of contrast). 



Fig. 4 Visual resolution as a function of the luminance of the test-stimuli 
increases almost linearly over a wide range to approach asymptotically 
the optimal level of performance. o-o: surround completely dark; solid
line: data from Konig (1897); x-x: data of Aulhorn under conditions
identical to Konig's; surround 10 asb (from Aulhorn, 1964).

As expected, thresholds for vernier discrimination (Krauskopf, pers. comm.) and
for stereoscopic depth perception increase with blurring of the underlying retinal 
images. Stigmar (1971) measured the influence of optical image degradation, 
induced by spectacle blur, on stereoscopic depth perception and vernier acuity. 
Performance in a vernier detection and in a stereoscopic depth discrimination 
task deteriorated with blurring of the test targets — but less than spatial 
resolution did (cf. Fig. 5; and Foley-Fisher, 1977, for vernier acuity). 



Fig. 5 The influence of stimulus blur on thresholds for vernier acuity and 
stereoscopic depth perception. Thresholds are shown on the ordinate 
(arcsec^-1), blur in diopters of spherical lenses on the abscissa,
corresponding to half-widths of the stimuli between 0.5' and 7.6'. A1 shows the
results for abutting stimuli, A2 shows results for stimuli with a gap (from
Stigmar, 1971). 


Julesz (1971) determined the dependence of stereo thresholds upon blurring in 
one eye. Performance was relatively good, even with one image considerably 
blurred. Julesz offered the explanation that the low spatial frequencies in both 
retinal images suffice to elicit the impression of depth. 


Temporal factors 

The number of quanta reaching each photoreceptor increases with stimulus 
luminance and presentation time. A given amount of noise inherent in the 
visual system's neuronal machinery will introduce a larger positional
uncertainty for weak signals (elicited by fewer quanta) than for strong signals, and
thresholds increase for shorter presentation times both for resolution (cf. Barlow, 
1958; Olzak & Thomas, 1986) and stereovision (Fig. 6; Ogle & Weil, 1958). The 
contrast sensitivity function, i.e., thresholds for different spatial frequencies, also 
depends strongly on temporal frequency (Kelly, 1972). 

Time is critical for fine stereoscopic depth discrimination in another respect. 
Stereoscopic depth perception is based on the computation of binocular 
disparities between simultaneously presented images of both eyes 
(Wheatstone, 1838; Julesz, 1971). Thresholds increase with increasing
asynchrony between the presentations of the targets to both eyes. But even 
when the presentations do not overlap in time, they may still elicit an impression 
of depth, probably mediated by some kind of visual spatial memory. This 
memory stores information for at least 200 msec, since presenting the images of 
both eyes alternately at a minimal frequency of about 5 Hz is sufficient to elicit a 
clear impression of stereoscopic depth (Guilloz, 1904; Ogle, 1963; Herzau, 
1976). Interestingly, the subjective impression of depth is weaker when the stimuli
are presented alternately to the two eyes than when they are shown
simultaneously. The larger the asynchrony in stimulus presentation to both 
eyes, the smaller the impression of depth elicited by these stimuli. 

Fig. 6 Thresholds for stereoscopic acuity as a function of presentation time for
stimuli slightly in front of (upper part) or on the fronto-parallel plane 
(lower part; from Ogle & Weil, 1958). 


Retinal image motion. 

Moving an object in the outer world while the eye is stationary, or moving the 
eye while the objects in the visual world are stationary, both cause a movement 
of the stimulus across the retina. A moving target stimulates each single 
photoreceptor for only a short time. The exact duration of stimulation depends 
on the relative velocity between the motions of the eye and of the object, as well 
as on the size of the retinal photoreceptors. The photoreceptor diameter in turn 
is a function of retinal eccentricity, i.e., distance and direction from the fovea 
centralis in man. 



Resolution of moving gratings or Landolt C's decreases at velocities above 4 
deg/sec (Westheimer & McKee, 1975; Burr, 1979). Poggio and Reichardt (1973)
for flies, and Diener et al. (1975), as well as Watson (1986) for humans have 
argued that detection of moving patterns depends on the product of spatial and 
temporal frequencies of the stimulus (see, however, de Graaf, Wertheim, Bles & 
Kremers, 1990). As a consequence, lower spatial frequencies should tolerate 
higher velocities than do high spatial frequencies. Performance at high 
temporal frequencies indeed deteriorates more for high spatial frequencies than 
for low ones (Kulikowski & Tolhurst, 1973). Therefore, resolvability of moving 
textures will generally decrease with increasing speed and increasing spatial 
frequency (corresponding, broadly speaking, to smaller texture elements), the 
amount of threshold elevation depending on the given texture and its speed. 
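
The trade-off between spatial frequency and tolerable velocity follows from the standard relation for a drifting grating: temporal frequency (Hz) equals spatial frequency (cycles/deg) times velocity (deg/sec). The sketch below uses a hypothetical temporal cut-off of 30 Hz purely for illustration; that value and the function names are assumptions, not data from the cited studies.

```python
def temporal_frequency_hz(spatial_freq_cpd, velocity_deg_per_s):
    """Temporal frequency produced at one retinal point by a drifting grating."""
    return spatial_freq_cpd * velocity_deg_per_s


def max_velocity_deg_per_s(spatial_freq_cpd, temporal_limit_hz):
    """Highest velocity that keeps a grating below a given temporal limit."""
    return temporal_limit_hz / spatial_freq_cpd


TEMPORAL_LIMIT_HZ = 30.0  # hypothetical cut-off, for illustration only

for sf in (3.0, 10.0, 30.0):
    v = max_velocity_deg_per_s(sf, TEMPORAL_LIMIT_HZ)
    print(f"{sf:4.1f} cycles/deg tolerates up to {v:4.1f} deg/sec")
```

Whatever the actual cut-off, a tenfold coarser texture tolerates a tenfold higher velocity under this relation.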

Westheimer and McKee (1975) found that detection thresholds for vernier 
targets moving at velocities up to 4°/sec were basically unaffected (cf. also 
Morgan & Benton, 1989). This speed corresponds to a movement of the
stimulus over approximately 500 foveal photoreceptors per second, leaving about
2 msec of stimulation time for each photoreceptor. A stationary target presented
for only 11, let alone 2 msec, had a significantly higher threshold, whereas the threshold
for a 200 msec presentation of a vernier moving at up to 4°/sec corresponds
roughly to that of a stationary vernier (Fig. 7; Westheimer & McKee, 1975).
Thus, vernier thresholds are lower if many photoreceptors are sequentially
stimulated for a short time (as in moving targets) than if a single photoreceptor is
briefly stimulated (as in a briefly presented stationary stimulus). The better
results obtained with moving targets are another indication that the visual
system is able to pool the information emanating from different photoreceptors.
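
The stimulation time per receptor quoted above follows from simple arithmetic, assuming the foveal density of about 120 receptors per degree given earlier. The function names are hypothetical.

```python
def receptors_crossed_per_second(velocity_deg_per_s, receptors_per_deg=120.0):
    """Number of receptors a moving target passes over each second."""
    return velocity_deg_per_s * receptors_per_deg


def dwell_time_ms(velocity_deg_per_s, receptors_per_deg=120.0):
    """Time the target spends on each single receptor, in milliseconds."""
    return 1000.0 / receptors_crossed_per_second(velocity_deg_per_s,
                                                 receptors_per_deg)


velocity = 4.0  # deg/sec, the upper limit reported for unimpaired thresholds
print(f"{receptors_crossed_per_second(velocity):.0f} receptors per second, "
      f"{dwell_time_ms(velocity):.1f} msec on each")
```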



Fig. 7 Thresholds for vernier acuity (left part) and Landolt C resolution (right 
part) as functions of target velocity (from Westheimer & McKee, 1975). 



Burr (1979) has measured thresholds for the correct identification of vernier 
offsets in continuously and discontinuously moving targets. His results are in 
close agreement with those by Westheimer and McKee (1975) in that motion 
with speeds up to 4 degrees per second does not increase thresholds in 
continuously moving targets. His results were confirmed in a study by Fahle 
and Poggio (1981) who additionally proposed a model for the pooling of 
information coming from different photoreceptors. 


Stimulus area and retinal position 

The size and information of the elements limit performance in texture
discrimination, and the size of the textured area can be a factor of importance as well
(since the smaller the area, the less the degree of polarization in Gabor-filter 
channels). In hyperacuity tasks such as vernier acuity, the size and exact 
configuration of the stimuli are critical. Extensive experiments performed in 
different laboratories suggest the existence of a spatial integration zone. This 
integration zone is around 10 arcmin wide and 20 to 30 arcmin long in both 
vernier acuity and stereoscopic vision (Westheimer & Hauske, 1975; 
Westheimer & McKee, 1977; Butler & Westheimer, 1978; Watt, Morgan & Ward, 
1983; Fahle & Kloos, unpublished). 



Fig. 8 Cone and ganglion cell density at different retinal positions (modified 
from Wassle et al., 1990). 

Thresholds for most visual tasks depend critically upon the position of the 
stimulus in the visual field, i.e., on which part of the retina is stimulated. In 
humans, the photoreceptor density is by far highest in the fovea, the mean distance
increasing by approximately a factor of 5 from the fovea to an eccentricity of 10°
(Curcio et al., 1987), and in monkeys the mean distance between ganglion cells
seems to increase by a similar factor (Wassle et al., 1990). The lower 
photoreceptor and ganglion-cell density in the periphery causes a decrease of
visual resolution that seems to be proportional to the decrement of the density of
photoreceptors (Wertheim, 1894; Østerberg, 1935; Low, 1951; Weymouth,
1958; van Buren, 1963; Levi, Klein & Aitsebaomo, 1985; Wassle et al., 1990; cf.
also Rovamo & Virsu, 1979). Fig. 8 shows the decline in photoreceptor and 
ganglion cell density in primates. For eccentricities above 5 deg., resolution is 
limited by the retina rather than by the optics of the eye (Green, 1970). 

Fig. 9 Thresholds for Landolt C resolution (a), detection of differences in
luminance (perimetry; b), and vernier acuity (c) as functions of 
eccentricity. A naso-temporal asymmetry appears in (a) and (b), but is 
more pronounced in (c) (from Fahle & Schmid, 1988). 


The decrease of stereoscopic depth perception and vernier acuity with
eccentricity is steeper than that of two-point resolution, both within the central part of
the visual field (Westheimer, 1982; Fendick and Westheimer, 1983; Levi et al.,
1985), and in the periphery (Fig. 9; Fahle & Schmid, 1988), and might be better
approximated by the decrease in ganglion-cell than in photoreceptor density (cf.
Wassle et al., 1990). Incidentally, resolution at 30° eccentricity is around 30% 
better in the temporal than in the nasal hemifield. Hyperacuity, as measured by 
vernier acuity, shows a much stronger naso-temporal asymmetry of 200% at the 
same eccentricity (cf. Fig. 9; Fahle, 1983; Fahle & Schmid, 1988). The cortical 
projections of both eyes are similarly asymmetrical in the periphery of monkeys 
(LeVay et al., 1985), and the ganglion cells seem to be, too (Wassle et al., 
1990). 

Stereoscopic vision (and hyperacuity in general) is probably a feat of the visual 
cortex, as extensive interactions between the projections of both eyes occur 
exclusively there. Further clues for the cortical origin of hyperacuity are that
disparity-sensitive neurons have been found only there (Poggio & Poggio, 1984),
that a kind of dichoptic vernier acuity exists (cf. McKee and Levi, 1987;
Fahle, 1990), and that the optic nerve does not have a sufficient number of
fibers to transmit explicitly interpolated positional information as required for
hyperacuity tasks (Fahle, 1988). 

To sum up, the perception of both texture and stereoscopic depth is limited
by a variety of factors determined by the input device, such as the eye, and by
the subsequent stages of information processing in the visual system. We have 
discussed here limitations imposed by the optical apparatus of the eye and by 
the retina upon spatial resolution and hyperacuity, by stimulus luminance, 
contrast, motion, temporal factors and retinal position. 



References 


Andrews, D.P.; Butcher, A.K., and Buckley, B.R. (1973) Acuities for spatial 
arrangement in line figures: human and ideal observer compared. Vision Res. 
13: 599-620. 

Aulhorn, E. (1964) Über die Beziehung zwischen Lichtsinn und Sehschärfe.
Albrecht von Graefes Arch. Ophthal. 167: 4-74. 

Barlow, H.B. (1958) Temporal and spatial summation in human vision at 
different background intensities. J. Physiol. (Lond) 141: 337-350.

Barlow, H.B. (1979) Reconstructing the visual image in space and time. 
Nature, 279: 189-190. 

Bergen, J.R., and Adelson E.H. (1988) Early vision and texture perception. 
Nature 333: 363-364. 

Bradley, A. and Skottun, B.C. (1987) Effects of contrast and spatial frequency on 
vernier acuity. Vision Res. 27: 1817-1824. 

Burr, D.C. (1979) On the visibility and appearance of objects in motion. D. Phil.
thesis, University of Cambridge.

Butler, T.W. and Westheimer, G. (1978) Interference with stereoscopic acuity:
spatial, temporal, and disparity tuning. Vision Res. 18: 1387-1392.

Caelli, T. (1988) An adaptive computational model for texture segmentation.
IEEE Transactions on Systems, Man, and Cybernetics 18: 9-17.

Campbell, F.W. (1974) The transmission of spatial information through the 
visual system. In: The Neurosciences; 3rd Study Program. Schmitt, F.O. & 
Worden, F.G. (eds), pp. 95-103. Cambridge: MIT Press. 

Campbell, F.W., and Gubisch, R.W. (1966) Optical quality of the human eye.
J. Physiol. (Lond) 186: 558-578.

Campbell, F.W., and Green, D.G. (1965) Optical and retinal factors affecting
visual resolution. J. Physiol. (Lond) 181: 576-593.

Campbell, F.W., and Robson, J.G. (1968) Application of Fourier analysis to the
visibility of gratings. J. Physiol. (Lond) 197: 551-566.

Crick, F.H., Marr, D.C., and Poggio, T. (1981) An information-processing 
approach to understanding the visual cortex. In: F.O. Schmitt (Ed.) The 
organization of cerebral cortex, pp. 505-533: Cambridge, Mass., MIT-Press. 




Curcio, C.A.; Sloan, K.R.; Packer, O.; Hendrickson, A.E., and Kalina, R.E. (1987) 
Distribution of cones in human and monkey retina: Individual variability and 
radial asymmetry. Science 236: 579-582. 

Daugman, J. (1987) Image analysis and compact coding by oriented 2D Gabor 
primitives, Proc SPIE 758 (Image understanding and the man-machine 
interface): 19-30. 

de Graaf, B.; Wertheim, A.H.; Bles, W. & Kremers, J. (1990) Angular velocity, 
not temporal frequency determines circular vection. Vision Res. 30, 637-646. 

Diener, H.C.; Wist, E.R.; Dichgans, J. & Brandt, T.H. (1975) The spatial 
frequency effect on perceived velocity. Vision Res. 16: 169-176.

Diepes, H. (1975) Refraktionsbestimmung. H. Postenrieder, Pforzheim. 

Fahle, M. (1983) Naso-temporal asymmetry in visual hyperacuity. Invest. 
Ophthal. Vis. Sci. 24: 146. 

Fahle, M. (1988) A hypothesis on the localization of hyperacuity interpolation in 
the visual system. Behav. Brain Res. 33: 314. 

Fahle, M. (1990) Psychophysical measurement of eye drifts and tremor by 
dichoptic or monocular vernier acuity. Vision Res. (accepted for publication). 

Fahle, M., and Poggio, T. (1981) Visual hyperacuity: Spatiotemporal
interpolation in human vision. Proc. Roy. Soc. Lond. B 213: 451-477.

Fahle, M., and Schmid, M. (1988) Naso-temporal asymmetry of visual 
perception and of the visual cortex. Vision Res. 28: 293-300.

Fendick, M., and Westheimer, G. (1983) Effects of practice and the separation of 
test targets on foveal and peripheral stereoacuity. Vision Res. 23: 145-150. 

Foley-Fisher, J.A. (1977) Contrast, edge-gradient, and target line width as
factors in vernier acuity. Optica Acta 24: 179-186.

Frisby, J.P., and Mayhew, J.E.W. (1978) Contrast sensitivity function for 
stereopsis. Perception 7: 423-429. 

Green, D.G. (1967) Visual resolution when light enters the eye through different 
parts of the pupil. J. Physiol. (Lond) 190: 583-593.

Green, D.G. (1970) Regional variations in the visual acuity for interference 
fringes on the retina. J. Physiol. (Lond) 207: 351-356. 

Green, D.G., and Campbell, F.W. (1965) Effect of focus on the visual response 
to a sinusoidally modulated spatial stimulus. J. Opt. Soc. Am. 55: 1154-1157.



Griffiths, E., Troscianko, T., and Knapman, J. (1988) A computational model of 
texture perception. Perception 17: 356. 

Guilloz, T. (1904) Sur la stéréoscopie obtenue par les visions consécutives
d'images monoculaires. Comp. rend. Soc. biol. 56: 1053-1054.

Halpern, D.L., and Blake, R.R. (1988) How contrast affects stereoacuity. 
Perception 17: 483-495. 

Hering, E. (1899) Ueber die Grenzen der Sehschärfe. Ber. Math. Phys. Classe
d. königl. Sächs. Gesellschaft d. Wissensch. Leipzig: 16-24.

Herzau, V. (1976) Stereosehen bei alternierender Bilddarbietung. Graefes 
Arch. Ophthal. 200: 85-91. 

Julesz, B. (1971) Foundations of cyclopean perception. University of Chicago 
Press. 

Julesz, B., and Bergen, J.R. (1983) Textons, the fundamental elements in 
preattentive vision and perception of textures. Bell System Technical Journal 
62: 1619-1645. 

Julesz, B., and Caelli, T. (1979) On the limits of Fourier decompositions in 
visual texture perception. Perception 8: 69-73. 

Julesz, B., and Kröse, B. (1988) Features and spatial filters. Nature 333:
302-303.

Kelly, D.H. (1972) Adaptation effects on spatio-temporal sine-wave thresholds. 
Vision Res. 12: 89-101. 

König, A. (1897) Die Abhängigkeit der Sehschärfe von der
Beleuchtungsintensität. Sitz. Ber. königl. preuss. Akad. Wiss. Berlin I. Halbb.: 559-572.

Kulikowski, J.J., and Tolhurst, D.J. (1973) Psychophysical evidence for
sustained and transient detectors in human vision. J. Physiol. (Lond) 232:
149-162.

LeVay, S.; Connolly, M.; Houde, J., and Van Essen, D.C. (1985) The complete
pattern of ocular dominance stripes in the striate cortex and visual field of the 
macaque monkey. J. Neuroscience 5: 486-501. 

Levi, D.M., Klein, S.A., and Aitsebaomo, A.P. (1985) Vernier acuity, crowding, 
and cortical magnification. Vision Res. 25: 963-977. 

Lit, A.; Finn, J.P., and Vicars, W.M. (1972) Effect of target-background 
luminance contrast on binocular depth discrimination at photopic levels of 
illumination. Vision Res. 12: 1241-1251. 



Lively, R., and Walters, D. (1988) Integration of detector responses for texture
segmentation. Proc. SPIE 937 (Applications of artificial intelligence VI): 86-93.

Low, F.N. (1951) Peripheral visual acuity. AMA Arch. Ophthal. 45: 80-99.

Ludvigh, E. (1941) Effect of reduced contrast on visual acuity as measured with 
Snellen test letters. Arch. Ophthalmol. 25: 469-474.

Ludvigh, E. (1953) Direction sense of the eye. Am. J. Ophthal. 36: 139-142. 

McKee, S.P., and Levi, D.M. (1987) Dichoptic hyperacuity: the precision of 
nonius alignment. J. Opt. Soc. Am. A 4: 1104-1108. 

Marr, D. (1976) Early processing of visual information. Phil. Trans. Roy. Soc. 
Lond. B 275: 483-524. 

Marr, D. (1982) Vision. W.H. Freeman, San Francisco. 

Marr, D. and Poggio, T. (1976) Cooperative computation of stereo disparity. 
Science 194: 283-287. 

Marr, D., and Poggio, T. (1979) A computational theory of human stereo vision. 
Proc. Roy. Soc. Lond. B 204: 301-328. 

Marr, D., Palm, G., and Poggio, T. (1978) Analysis of a cooperative stereo 
algorithm. Biol. Cybernetics 28: 223-239.

Mayhew, J.E.W., and Frisby, J.P. (1978) Texture discrimination and Fourier 
analysis in human vision. Nature 275: 438-439. 

Morgan, M.J. (1986) The detection of spatial discontinuities: interactions 
between contrast and spatial continuity. Spatial Vision 1: 291-303.

Morgan, M.J., and Benton, S. (1989) Motion-deblurring in human vision. Nature 
340: 385-386.

Ogle, K.N. (1963) Stereoscopic depth perception and exposure delay between 
images to the two eyes. J. Opt. Soc. Am. 53: 1296-1304.

Ogle, K.N., and Weil, M.P. (1958) Stereoscopic vision and the duration of 
stimulus. Arch. Ophthal. 59: 4-17. 

Olzak, L.A., and Thomas, J.P. (1986) Seeing spatial patterns. In: K.R. Boff, L. 
Kaufman, and J.P. Thomas (eds) Handbook of perception and human 
performance I, Sensory processes and perception; Chapt. 7. New York: Wiley. 

Østerberg, G. (1935) Topography of the layers of rods and cones in the human
retina. Acta ophthal. (Kbh) 13, Suppl. 6: 1-102.




Poggio, G., and Poggio, T. (1984) The analysis of stereopsis. Ann. Rev. 
Neurosci. 7: 379-412.

Poggio, T. & Reichardt, W. (1973) Considerations on models of movement 
detection. Kybernetik 13: 223-227.

Rentschler, I., Hübner, M., and Caelli, T. (1988) On the discrimination of
compound Gabor signals and textures. Vision Res. 28: 279-291. 

Röhler, R. (1962) Die Abbildungseigenschaften der Augenmedien. Vision Res.
2: 391-429. 

Rovamo, J., and Virsu, V. (1979) An estimation and application of the human 
cortical magnification factor. Exp. Brain Res. 37: 495-510.

Shannon, C.E. (1948) A mathematical theory of communication. Bell Syst. Tech.
J. 27: 623-656.

Stigmar, G. (1971) Blurred visual stimuli II. The effect of blurred visual stimuli on 
vernier and stereo acuity. Acta Ophthal. 49: 364-379.

van Buren, J.M. (1963) The retinal ganglion cell layer. C.C. Thomas, 
Springfield, Ill.

"Vernier, Pierre" by E.A. Avallone. In: The Encyclopedia Americana,
International Edition (Americana, New York, 1976); p. 38.

Voorhees, H. (1987) Finding texture boundaries in images. MIT AI Technical
Report 968, June 1987. 

Voorhees, H., and Poggio, T. (1988) Computing texture boundaries from 
images. Nature 333: 364-367. 

Wässle, H., Grünert, U., Röhrenbeck, J., and Boycott, B.B. (1990) Cortical
magnification factor, spatial resolution, and retinal ganglion cell density in the 
primate. Vision Res. (in the press). 

Watson, A.B. (1986) Temporal sensitivity. In: K.R. Boff, L. Kaufman, and J.P. 
Thomas (eds) Handbook of perception and human performance I, Sensory 
processes and perception; Chapt. 6. New York: Wiley. 

Watt, R.J.; Morgan, M.J., and Ward, R.M. (1983) The use of different cues in 
vernier acuity. Vision Res. 23: 991-995.

Wertheim, T. (1894) Über die indirekte Sehschärfe. Z. Psychol. Physiol.
Sinnesorg. 7: 172-187. 

Westheimer, G. (1960) Modulation thresholds for sinusoidal light distributions 
on the retina. J. Physiol. (Lond) 152: 67-74. 





Westheimer, G. (1976) Diffraction theory and visual hyperacuity. Am. J. Optom. 
Physiol. Opt. 53: 362-364. 

Westheimer, G. (1982) The spatial grain of the perifoveal visual field. Vision 
Res. 22: 157-162.

Westheimer, G., and Hauske, G. (1975) Temporal and spatial interference with 
vernier acuity. Vision Res. 15: 1137-1141.

Westheimer, G., and McKee, S.P. (1975) Visual acuity in the presence of retinal 
image-motion. J. Opt. Soc. Am. 65: 847-850.

Westheimer, G., and McKee, S.P. (1977) Integration regions for visual
hyperacuity. Vision Res. 17: 89-93.

Weymouth, F.W. (1958) Visual sensory units and the minimal angle of 
resolution. Am. J. Ophthal. 46: 102-113. 

Wheatstone, C. (1838) Contributions to the physiology of vision. Phil. Trans. Roy.
Soc. Lond.: 371-394.

Wilson, H.R. (1986) Responses of spatial mechanisms can explain hyperacuity. 
Vision Res. 26: 453-469.

Wülfing, E.A. (1892) Ueber den kleinsten Gesichtswinkel. Z. Biol. 29: 199-202.



